Automatic Time Expression Labeling for English and Chinese Text

Authors

  • Kadri Hacioglu
  • Ying Chen
  • Benjamin Douglas
Abstract

In this paper, we describe systems for automatic labeling of time expressions occurring in English and Chinese text as specified in the ACE Temporal Expression Recognition and Normalization (TERN) task. We cast the chunking of text into time expressions as a tagging problem using a bracketed representation at the token level, which takes embedded constructs into account. We adopted a left-to-right, token-by-token, discriminative, deterministic classification scheme to determine the tag for each token. A number of features are created from a predefined context centered at each token and augmented with decisions from a rule-based time expression tagger and/or a statistical time expression tagger trained on different types of text data, on the assumption that they provide complementary information. We trained one-versus-all multi-class classifiers using support vector machines. We participated in the TERN 2004 recognition task and achieved competitive results.
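The bracketed token-level representation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact tag inventory, so the scheme below (opening brackets, a `*` placeholder, closing brackets) and the function names `bracket_tags` and `decode_spans` are assumptions chosen for illustration, not the authors' implementation. The point is only that one tag per token is enough to encode embedded time expressions and to recover them afterwards.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end inclusive


def bracket_tags(n_tokens: int, spans: List[Span]) -> List[str]:
    """Encode possibly nested spans as one bracket tag per token.

    Hypothetical scheme: each tag is the opening brackets that start at
    the token, then '*', then the closing brackets that end at the token.
    """
    opens = [0] * n_tokens
    closes = [0] * n_tokens
    for start, end in spans:
        opens[start] += 1
        closes[end] += 1
    return ["(" * o + "*" + ")" * c for o, c in zip(opens, closes)]


def decode_spans(tags: List[str]) -> List[Span]:
    """Recover the spans from a well-formed bracket tag sequence."""
    stack: List[int] = []
    spans: List[Span] = []
    for i, tag in enumerate(tags):
        for _ in range(tag.count("(")):
            stack.append(i)  # remember where each open bracket started
        for _ in range(tag.count(")")):
            spans.append((stack.pop(), i))  # innermost span closes first
    return sorted(spans)


tokens = ["He", "left", "two", "weeks", "from", "next", "Tuesday", "."]
# "next Tuesday" is embedded inside "two weeks from next Tuesday".
gold_spans = [(2, 6), (5, 6)]
tags = bracket_tags(len(tokens), gold_spans)
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

Under this encoding, a left-to-right classifier only has to predict one tag string per token, and the nested expressions are recovered by matching brackets with a stack, as `decode_spans` does.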


Similar resources

Automatic Semantic Role Labeling for Chinese Verbs

Recent years have seen a revived interest in semantic parsing by applying statistical and machine-learning methods to semantically annotated corpora such as the FrameNet and the Proposition Bank. So far much of the research has been focused on English due to the lack of semantically annotated resources in other languages. In this paper, we report first results on semantic role labeling using a pr...


Labeling Chinese Predicates with Semantic Roles

In this article we report work on Chinese semantic role labeling, taking advantage of two recently completed corpora, the Chinese PropBank, a semantically annotated corpus of Chinese verbs, and the Chinese Nombank, a companion corpus that annotates the predicate–argument structure of nominalized predicates. Because the semantic role labels are assigned to the constituents in a parse tree, we fi...


Automatic Prosodic Break Lab Chinese Speech

For corpus-based speech synthesis, large quantities of labeled speech are required. Manually labeling speech data is quite labor-intensive. Therefore, automatic speech labeling is highly desired. Prosodic break detection is one of the tasks for automatic speech labeling. In this paper, we propose an automatic break detection algorithm for Mandarin Chinese speech. In this approach, we use energy c...


Multilingual Sentiment and Subjectivity Analysis

Subjectivity and sentiment analysis focuses on the automatic identification of private states, such as opinions, emotions, sentiments, evaluations, beliefs, and speculations in natural language. While subjectivity classification labels text as either subjective or objective, sentiment classification adds an additional level of granularity, by further classifying subjective text as either positi...


Transliteration of Proper Names in Cross-Lingual Information Retrieval

We address the problem of transliterating English names using Chinese orthography in support of cross-lingual speech and text processing applications. We demonstrate the application of statistical machine translation techniques to “translate” the phonemic representation of an English name, obtained by using an automatic text-to-speech system, to a sequence of initials and finals, commonly used ...




Publication date: 2005